Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering
Authors
Abstract
Agglomerative hierarchical clustering (AHC) becomes infeasible when applied to large datasets because of its O(N^2) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments.
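The divide-and-conquer loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how subsets are formed, how clusters are merged, or how many clusters each subset yields. The function names (`ahc_labels`, `mahc_sketch`), the fixed `clusters_per_subset`, the use of cluster centroids as representatives, Ward linkage, and fixed-length Euclidean feature vectors are all hypothetical choices made for the example. Real speech segments vary in length and would need a suitable segment-level distance.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def ahc_labels(points, n_clusters):
    """Exact AHC (Ward linkage) on one subset; returns contiguous labels from 0."""
    if len(points) <= n_clusters:
        # Too few points to merge: every point is its own cluster.
        return np.arange(len(points))
    tree = linkage(points, method="ward")
    flat = fcluster(tree, t=n_clusters, criterion="maxclust")
    # Relabel so that labels are contiguous integers starting at 0.
    _, contiguous = np.unique(flat, return_inverse=True)
    return contiguous


def mahc_sketch(data, n_subsets=4, clusters_per_subset=3, n_iterations=3, seed=0):
    """Hypothetical multi-stage AHC loop: divide, cluster, merge, re-divide."""
    rng = np.random.default_rng(seed)
    # Assumed initial division: random subsets of roughly equal size.
    subset_of = rng.permutation(len(data)) % n_subsets
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iterations):
        # Stage 1: cluster each subset independently. Only one subset's
        # distances are needed at a time, and subsets could run in parallel.
        next_label = 0
        for s in np.unique(subset_of):
            idx = np.flatnonzero(subset_of == s)
            local = ahc_labels(data[idx], clusters_per_subset)
            labels[idx] = local + next_label
            next_label += local.max() + 1
        # Stage 2 (assumed merge rule): represent each cluster by its centroid
        # and group the centroids with AHC.
        centroids = np.array(
            [data[labels == c].mean(axis=0) for c in range(next_label)]
        )
        group_of_cluster = ahc_labels(centroids, n_subsets)
        # Re-divide: each group of merged clusters becomes a subset for the
        # next iteration.
        subset_of = group_of_cluster[labels]
    return labels
```

With N points split into P subsets of similar size, each stage-1 call needs distances for roughly N/P points instead of all N, which is where the storage saving comes from. The sketch does not cap subset sizes after re-division, so a practical version would need some rule for splitting subsets that grow too large.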